cognitiveclass.ai logo

Assignment: SQL Notebook for Peer Assignment

Estimated time needed: 60 minutes.

Introduction ΒΆ

Using this Python notebook you will:

  1. Understand the Spacex DataSet
  2. Load the dataset into the corresponding table in a Db2 database
  3. Execute SQL queries to answer assignment questions

Overview of the DataSet ΒΆ

SpaceX has gained worldwide attention for a series of historic milestones.

It is the only private company ever to return a spacecraft from low-earth orbit, which it first accomplished in December 2010. SpaceX advertises Falcon 9 rocket launches on its website with a cost of 62 million dollars wheras other providers cost upward of 165 million dollars each, much of the savings is because Space X can reuse the first stage.

Therefore if we can determine if the first stage will land, we can determine the cost of a launch.

This information can be used if an alternate company wants to bid against SpaceX for a rocket launch.

This dataset includes a record for each payload carried during a SpaceX mission into outer space.

Download the datasets ΒΆ

This assignment requires you to load the spacex dataset.

In many cases the dataset to be analyzed is available as a .CSV (comma separated values) file, perhaps on the internet. Click on the link below to download and save the dataset (.CSV file):

Spacex DataSet

Store the dataset in database table ΒΆ

it is highly recommended to manually load the table using the database console LOAD tool in DB2 .

Now open the Db2 console, open the LOAD tool, Select / Drag the .CSV file for the dataset, Next create a New Table, and then follow the steps on-screen instructions to load the data. Name the new table as follows:

SPACEXDATASET

Follow these steps while using old DB2 UI which is having Open Console Screen

Note:While loading Spacex dataset, ensure that detect datatypes is disabled. Later click on the pencil icon(edit option).

  1. Change the Date Format by manually typing DD-MM-YYYY and timestamp format as DD-MM-YYYY HH\:MM:SS.

    Here you should place the cursor at Date field and manually type as DD-MM-YYYY.

  2. Change the PAYLOAD MASS\ _KG_ datatype to INTEGER.

Changes to be considered when having DB2 instance with the new UI having Go to UI screen

  • Refer to this insruction in this link for viewing the new Go to UI screen.

  • Later click on Data link(below SQL) in the Go to UI screen and click on Load Data tab.

  • Later browse for the downloaded spacex file.

  • Once done select the schema andload the file.

InΒ [1]:
!pip install sqlalchemy==1.3.9
!pip install ibm_db_sa
!pip install ipython-sql
Requirement already satisfied: sqlalchemy==1.3.9 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (1.3.9)
Requirement already satisfied: ibm_db_sa in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (0.3.7)
Requirement already satisfied: ibm-db>=2.0.0 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ibm_db_sa) (3.1.1)
Requirement already satisfied: sqlalchemy>=0.7.3 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ibm_db_sa) (1.3.9)
Requirement already satisfied: ipython-sql in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (0.4.0)
Requirement already satisfied: prettytable<1 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython-sql) (0.7.2)
Requirement already satisfied: six in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython-sql) (1.16.0)
Requirement already satisfied: ipython-genutils>=0.1.0 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython-sql) (0.2.0)
Requirement already satisfied: ipython>=1.0 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython-sql) (8.0.1)
Requirement already satisfied: sqlparse in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython-sql) (0.4.2)
Requirement already satisfied: sqlalchemy>=0.6.7 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython-sql) (1.3.9)
Requirement already satisfied: decorator in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython>=1.0->ipython-sql) (5.1.1)
Requirement already satisfied: pygments in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython>=1.0->ipython-sql) (2.11.2)
Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython>=1.0->ipython-sql) (3.0.28)
Requirement already satisfied: jedi>=0.16 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython>=1.0->ipython-sql) (0.18.1)
Requirement already satisfied: appnope in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython>=1.0->ipython-sql) (0.1.2)
Requirement already satisfied: traitlets>=5 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython>=1.0->ipython-sql) (5.1.1)
Requirement already satisfied: backcall in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython>=1.0->ipython-sql) (0.2.0)
Requirement already satisfied: pickleshare in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython>=1.0->ipython-sql) (0.7.5)
Requirement already satisfied: setuptools>=18.5 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython>=1.0->ipython-sql) (60.8.2)
Requirement already satisfied: matplotlib-inline in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython>=1.0->ipython-sql) (0.1.3)
Requirement already satisfied: black in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython>=1.0->ipython-sql) (22.1.0)
Requirement already satisfied: pexpect>4.3 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython>=1.0->ipython-sql) (4.8.0)
Requirement already satisfied: stack-data in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from ipython>=1.0->ipython-sql) (0.1.4)
Requirement already satisfied: parso<0.9.0,>=0.8.0 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from jedi>=0.16->ipython>=1.0->ipython-sql) (0.8.3)
Requirement already satisfied: ptyprocess>=0.5 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from pexpect>4.3->ipython>=1.0->ipython-sql) (0.7.0)
Requirement already satisfied: wcwidth in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython>=1.0->ipython-sql) (0.2.5)
Requirement already satisfied: pathspec>=0.9.0 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from black->ipython>=1.0->ipython-sql) (0.9.0)
Requirement already satisfied: typing-extensions>=3.10.0.0 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from black->ipython>=1.0->ipython-sql) (4.0.1)
Requirement already satisfied: click>=8.0.0 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from black->ipython>=1.0->ipython-sql) (8.0.3)
Requirement already satisfied: mypy-extensions>=0.4.3 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from black->ipython>=1.0->ipython-sql) (0.4.3)
Requirement already satisfied: tomli>=1.1.0 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from black->ipython>=1.0->ipython-sql) (2.0.1)
Requirement already satisfied: platformdirs>=2 in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from black->ipython>=1.0->ipython-sql) (2.5.0)
Requirement already satisfied: executing in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from stack-data->ipython>=1.0->ipython-sql) (0.8.2)
Requirement already satisfied: pure-eval in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from stack-data->ipython>=1.0->ipython-sql) (0.2.2)
Requirement already satisfied: asttokens in ./env_IBM_Applied_Data_Science_Capstone/lib/python3.8/site-packages (from stack-data->ipython>=1.0->ipython-sql) (2.0.5)

Connect to the database ΒΆ

Let us first load the SQL extension and establish a connection with the database

InΒ [2]:
%load_ext sql

DB2 magic in case of old UI service credentials.

In the next cell enter your db2 connection string. Recall you created Service Credentials for your Db2 instance before. From the uri field of your Db2 service credentials copy everything after db2:// (except the double quote at the end) and paste it in the cell below after ibm_db_sa://

in the following format

%sql ibm_db_sa://my-username:my-password\@my-hostname:my-port/my-db-name

DB2 magic in case of new UI service credentials.

  • Use the following format.

  • Add security=SSL at the end

%sql ibm_db_sa://my-username:my-password\@my-hostname:my-port/my-db-name?security=SSL

InΒ [Β ]:
 
InΒ [3]:
## CONNECTION:
# %sql ibm_db_sa://USER:PASSWORD@764264db-9824-4b7c-82df-40d1b13897c2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:32536/bludb?authSource=admin&replicaSet=replset&security=SSL
InΒ [4]:
from secret import sql_user, sql_pass

%sql ibm_db_sa://{sql_user}:{sql_pass}@764264db-9824-4b7c-82df-40d1b13897c2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:32536/bludb?authSource=admin&replicaSet=replset&security=SSL

Tasks ΒΆ

Now write and execute SQL queries to solve the assignment tasks.

Task 1 ΒΆ

Display the names of the unique launch sites in the space mission ΒΆ
InΒ [5]:
%%sql
SELECT DISTINCT launch_site FROM SPACEXTBL;
 * ibm_db_sa://qvc29638:***@764264db-9824-4b7c-82df-40d1b13897c2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:32536/bludb?authSource=admin&replicaSet=replset
Done.
Out[5]:
launch_site
CCAFS LC-40
CCAFS SLC-40
KSC LC-39A
VAFB SLC-4E

Task 2 ΒΆ

Display 5 records where launch sites begin with the string 'CCA' ΒΆ
InΒ [6]:
%%sql
SELECT * FROM SPACEXTBL WHERE launch_site LIKE 'CCA%' LIMIT 5;
 * ibm_db_sa://qvc29638:***@764264db-9824-4b7c-82df-40d1b13897c2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:32536/bludb?authSource=admin&replicaSet=replset
Done.
Out[6]:
DATE time__utc_ booster_version launch_site payload payload_mass__kg_ orbit customer mission_outcome landing__outcome
2010-06-04 18:45:00 F9 v1.0 B0003 CCAFS LC-40 Dragon Spacecraft Qualification Unit 0 LEO SpaceX Success Failure (parachute)
2010-12-08 15:43:00 F9 v1.0 B0004 CCAFS LC-40 Dragon demo flight C1, two CubeSats, barrel of Brouere cheese 0 LEO (ISS) NASA (COTS) NRO Success Failure (parachute)
2012-05-22 07:44:00 F9 v1.0 B0005 CCAFS LC-40 Dragon demo flight C2 525 LEO (ISS) NASA (COTS) Success No attempt
2012-10-08 00:35:00 F9 v1.0 B0006 CCAFS LC-40 SpaceX CRS-1 500 LEO (ISS) NASA (CRS) Success No attempt
2013-03-01 15:10:00 F9 v1.0 B0007 CCAFS LC-40 SpaceX CRS-2 677 LEO (ISS) NASA (CRS) Success No attempt

Task 3 ΒΆ

Display the total payload mass carried by boosters launched by NASA (CRS) ΒΆ
InΒ [7]:
%%sql
SELECT sum(payload_mass__kg_) AS "Total payload mass (NASA (CRS))" FROM SPACEXTBL WHERE customer = 'NASA (CRS)';
 * ibm_db_sa://qvc29638:***@764264db-9824-4b7c-82df-40d1b13897c2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:32536/bludb?authSource=admin&replicaSet=replset
Done.
Out[7]:
Total payload mass (NASA (CRS))
45596

Task 4 ΒΆ

Display average payload mass carried by booster version F9 v1.1 ΒΆ
InΒ [8]:
%%sql
SELECT AVG(payload_mass__kg_) AS "Average payload mass (booster version F9 v1.1)" FROM SPACEXTBL WHERE booster_version LIKE 'F9 v1.1%';
 * ibm_db_sa://qvc29638:***@764264db-9824-4b7c-82df-40d1b13897c2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:32536/bludb?authSource=admin&replicaSet=replset
Done.
Out[8]:
Average payload mass (booster version F9 v1.1)
2534

Task 5 ΒΆ

List the date when the first successful landing outcome in ground pad was acheived. ΒΆ

Hint:Use min function

InΒ [9]:
%%sql
SELECT min(DATE) AS "First successful landing outcome in ground pad" FROM SPACEXTBL WHERE landing__outcome = 'Success (ground pad)';
 * ibm_db_sa://qvc29638:***@764264db-9824-4b7c-82df-40d1b13897c2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:32536/bludb?authSource=admin&replicaSet=replset
Done.
Out[9]:
First successful landing outcome in ground pad
2015-12-22

Task 6 ΒΆ

List the names of the boosters which have success in drone ship and have payload mass greater than 4000 but less than 6000 ΒΆ
InΒ [10]:
%%sql
SELECT booster_version FROM SPACEXTBL WHERE landing__outcome = 'Success (drone ship)' AND payload_mass__kg_ BETWEEN 4000 AND 6000;
 * ibm_db_sa://qvc29638:***@764264db-9824-4b7c-82df-40d1b13897c2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:32536/bludb?authSource=admin&replicaSet=replset
Done.
Out[10]:
booster_version
F9 FT B1022
F9 FT B1026
F9 FT B1021.2
F9 FT B1031.2

Task 7 ΒΆ

List the total number of successful and failure mission outcomes ΒΆ
InΒ [11]:
%%sql
SELECT 'Success' AS "Outcome", count(*) AS "Count" FROM SPACEXTBL WHERE landing__outcome LIKE 'Success%'
UNION ALL
SELECT 'Failure' AS "Outcome", count(*) AS "Count" FROM SPACEXTBL WHERE landing__outcome NOT LIKE 'Success%'
UNION ALL
SELECT '(All)' AS "Outcome", count(*) AS "Count" FROM SPACEXTBL;
 * ibm_db_sa://qvc29638:***@764264db-9824-4b7c-82df-40d1b13897c2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:32536/bludb?authSource=admin&replicaSet=replset
Done.
Out[11]:
Outcome Count
Success 61
Failure 40
(All) 101

Task 8 ΒΆ

List the names of the booster_versions which have carried the maximum payload mass. Use a subquery ΒΆ
InΒ [12]:
%%sql
SELECT DISTINCT booster_version
FROM SPACEXTBL
WHERE payload_mass__kg_ = (
    SELECT max(payload_mass__kg_)
    FROM SPACEXTBL
)
 * ibm_db_sa://qvc29638:***@764264db-9824-4b7c-82df-40d1b13897c2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:32536/bludb?authSource=admin&replicaSet=replset
Done.
Out[12]:
booster_version
F9 B5 B1048.4
F9 B5 B1048.5
F9 B5 B1049.4
F9 B5 B1049.5
F9 B5 B1049.7
F9 B5 B1051.3
F9 B5 B1051.4
F9 B5 B1051.6
F9 B5 B1056.4
F9 B5 B1058.3
F9 B5 B1060.2
F9 B5 B1060.3

Task 9 ΒΆ

List the failed landing_outcomes in drone ship, their booster versions, and launch site names for in year 2015 ΒΆ
InΒ [13]:
%%sql

SELECT booster_version, launch_site,landing__outcome
FROM SPACEXTBL
WHERE landing__outcome = 'Failure (drone ship)'
  AND EXTRACT(YEAR FROM DATE) = 2015
 * ibm_db_sa://qvc29638:***@764264db-9824-4b7c-82df-40d1b13897c2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:32536/bludb?authSource=admin&replicaSet=replset
Done.
Out[13]:
booster_version launch_site landing__outcome
F9 v1.1 B1012 CCAFS LC-40 Failure (drone ship)
F9 v1.1 B1015 CCAFS LC-40 Failure (drone ship)

Task 10 ΒΆ

Rank the count of landing outcomes (such as Failure (drone ship) or Success (ground pad)) between the date 2010-06-04 and 2017-03-20, in descending order ΒΆ
InΒ [14]:
%%sql

SELECT landing__outcome, COUNT(*) AS "Count"
FROM SPACEXTBL
WHERE DATE BETWEEN '2010-06-04' and '2017-03-20'
GROUP BY landing__outcome
ORDER BY Count DESC
;
 * ibm_db_sa://qvc29638:***@764264db-9824-4b7c-82df-40d1b13897c2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:32536/bludb?authSource=admin&replicaSet=replset
Done.
Out[14]:
landing__outcome Count
No attempt 10
Failure (drone ship) 5
Success (drone ship) 5
Controlled (ocean) 3
Success (ground pad) 3
Failure (parachute) 2
Uncontrolled (ocean) 2
Precluded (drone ship) 1

Author(s) ΒΆ

Lakshmi Holla

Other Contributors ΒΆ

Rav Ahuja

Change log ΒΆ

Date Version Changed by Change Description
2021-10-12 0.4 Lakshmi Holla Changed markdown
2021-08-24 0.3 Lakshmi Holla Added library update
2021-07-09 0.2 Lakshmi Holla Changes made in magic sql
2021-05-20 0.1 Lakshmi Holla Created Initial Version

Β© IBM Corporation 2021. All rights reserved.